Deploying a Model and Predicting with Cloud Machine Learning Engine

This notebook is the final step in a series of notebooks for doing machine learning in the cloud. The previous notebook demonstrated evaluating a model. In a real-world scenario, there are likely to be multiple evaluation datasets, as well as multiple candidate models, to evaluate before a model is suitable for deployment.

Workspace Setup

The first step is to set up the workspace that we will use within this notebook: the Python libraries, and the Google Cloud Storage bucket that will contain the inputs and outputs produced over the course of these steps.


In [1]:
import google.datalab as datalab
import google.datalab.ml as ml
import mltoolbox.regression.dnn as regression
import os
import requests
import time

The storage bucket was created earlier. We'll re-declare it here so we can use it.


In [2]:
storage_bucket = 'gs://' + datalab.Context.default().project_id + '-datalab-workspace/'
storage_region = 'us-central1'

workspace_path = os.path.join(storage_bucket, 'census')
training_path = os.path.join(workspace_path, 'training')

model_name = 'census'
model_version = 'v1'

Model

Let's take a quick look at the model that was previously produced as a result of the training job. This is the model that was evaluated, and that is going to be deployed.


In [3]:
!gsutil ls -r {training_path}/model


gs://cloud-ml-users-datalab-workspace/census/training/model/:
gs://cloud-ml-users-datalab-workspace/census/training/model/
gs://cloud-ml-users-datalab-workspace/census/training/model/saved_model.pb

gs://cloud-ml-users-datalab-workspace/census/training/model/assets.extra/:
gs://cloud-ml-users-datalab-workspace/census/training/model/assets.extra/
gs://cloud-ml-users-datalab-workspace/census/training/model/assets.extra/features.json
gs://cloud-ml-users-datalab-workspace/census/training/model/assets.extra/schema.json

gs://cloud-ml-users-datalab-workspace/census/training/model/variables/:
gs://cloud-ml-users-datalab-workspace/census/training/model/variables/
gs://cloud-ml-users-datalab-workspace/census/training/model/variables/variables.data-00000-of-00001
gs://cloud-ml-users-datalab-workspace/census/training/model/variables/variables.index

Deployment

Cloud Machine Learning Engine provides APIs to deploy and manage models. The first step is to create a model resource, a named container for deployed models. The second step is to deploy the trained model binaries as a version within that model.

NOTE: These steps can take a few minutes.


In [8]:
!gcloud ml-engine models create {model_name} --regions {storage_region}

In [9]:
!gcloud ml-engine versions create {model_version} --model {model_name} --origin {training_path}/model


Creating version (this might take a few minutes)......done.
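
To confirm the deployment, the versions registered under the model can be listed. This quick check uses a standard gcloud command, though it was not part of the original notebook flow:

!gcloud ml-engine versions list --model {model_name}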

At this point the model is ready for batch prediction jobs. It is also automatically exposed as an HTTP endpoint for performing online prediction.

Online Prediction

Online prediction is accomplished by issuing HTTP requests to the endpoint of a specific model version. The instances to be predicted are formatted as JSON in the request body; their structure depends on the model. The census model in this sample was trained on data formatted as CSV, so it expects each input instance to be a CSV-formatted string.

Prediction results are returned as JSON in the response.

HTTP requests must include an OAuth access token in the Authorization header to succeed. In a Datalab notebook, the token corresponding to the environment is accessible without requiring an OAuth flow. Actual applications will need to determine the best strategy for acquiring OAuth tokens, generally by using Application Default Credentials.
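
For reference, here is a minimal sketch of acquiring a token through Application Default Credentials using the google-auth library. This assumes google-auth is installed and that credentials are available in the environment (for example via GOOGLE_APPLICATION_CREDENTIALS or the Compute Engine metadata server):

import google.auth
from google.auth.transport.requests import Request

# Obtain Application Default Credentials scoped for Google Cloud APIs.
credentials, project_id = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])

# Refresh to populate an access token, then pass it as a Bearer token.
credentials.refresh(Request())
access_token = credentials.token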


In [10]:
api = 'https://ml.googleapis.com/v1/projects/{project}/models/{model}/versions/{version}:predict'
url = api.format(project=datalab.Context.default().project_id,
                 model=model_name,
                 version=model_version)

headers = {
  'Content-Type': 'application/json',
  'Authorization': 'Bearer ' + datalab.Context.default().credentials.get_access_token().access_token
}

body = {
  'instances': [
    '490,64,2,0,1,0,2,8090,015,01,1,00590,00500,1,18,0,2,1',
    '1225,32,5,0,4,5301,2,9680,015,01,1,00100,00100,1,21,2,1,1',
    '1226,30,1,0,1,0,2,8680,020,01,1,00100,00100,1,16,0,2,1'
  ]
}

response = requests.post(url, json=body, headers=headers)
predictions = response.json()['predictions']

predictions


Out[10]:
[{u'SERIALNO': u'490', u'predicted': 26.395479202270508},
 {u'SERIALNO': u'1225', u'predicted': 68.57681274414062},
 {u'SERIALNO': u'1226', u'predicted': 13.854779243469238}]

It is quite simple to issue these requests using your HTTP library of choice. Actual applications should include logic to handle errors, including retries.
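
As an illustration, here is a minimal retry sketch around the request above, using exponential backoff. The attempt count and delays are arbitrary choices for this example, and url, body, and headers are reused from the previous cell:

# Retry transient failures with exponential backoff.
for attempt in range(3):
    response = requests.post(url, json=body, headers=headers)
    if response.status_code == 200:
        predictions = response.json()['predictions']
        break
    # Back off before retrying (e.g. on 429 or 5xx responses).
    time.sleep(2 ** attempt)
else:
    # All attempts failed; surface the last error.
    response.raise_for_status()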

Batch Prediction

While online prediction is optimized for low-latency requests over small lists of instances, batch prediction is designed for high-throughput prediction for large datasets. The same model can be used for both.

Batch prediction jobs can also be submitted via the API (a sketch follows the gcloud example below). They are easily submitted via the gcloud tool as well.


In [11]:
%%file /tmp/instances.csv
490,64,2,0,1,0,2,8090,015,01,1,00590,00500,1,18,0,2,1
1225,32,5,0,4,5301,2,9680,015,01,1,00100,00100,1,21,2,1,1
1226,30,1,0,1,0,2,8680,020,01,1,00100,00100,1,16,0,2,1


Writing /tmp/instances.csv

In [12]:
prediction_data_path = os.path.join(workspace_path, 'data/prediction.csv')

In [13]:
!gsutil -q cp /tmp/instances.csv {prediction_data_path}

Each batch prediction job must have a unique name within the scope of a project. The name below incorporates a timestamp, so re-running this notebook produces a unique name each time.


In [14]:
job_name = 'census_prediction_' + str(int(time.time()))
prediction_path = os.path.join(workspace_path, 'predictions')

NOTE: A batch prediction job can take a few minutes due to the overhead of provisioning resources. This overhead is reasonable for large jobs, but can far exceed the actual prediction time for a tiny dataset such as the one used in this sample.


In [15]:
!gcloud ml-engine jobs submit prediction {job_name} --model {model_name} --version {model_version} --data-format TEXT --input-paths {prediction_data_path} --output-path {prediction_path} --region {storage_region}


createTime: '2017-03-07T20:00:36Z'
jobId: census_prediction_1488916830
predictionInput:
  dataFormat: TEXT
  inputPaths:
  - gs://cloud-ml-users-datalab-workspace/census/data/prediction.csv
  outputPath: gs://cloud-ml-users-datalab-workspace/census/predictions
  region: us-central1
  runtimeVersion: '1.0'
  versionName: projects/cloud-ml-users/models/census/versions/v1
predictionOutput:
  outputPath: gs://cloud-ml-users-datalab-workspace/census/predictions
state: QUEUED
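
As noted above, the same job could be submitted through the REST API instead of gcloud. Here is a sketch using the Google API Python client; it assumes google-api-python-client is installed and picks up default credentials, and the request body mirrors the fields shown in the job description above:

from googleapiclient import discovery

# Build a client for the Cloud ML Engine v1 API.
ml_service = discovery.build('ml', 'v1')

project_id = datalab.Context.default().project_id
job_body = {
  'jobId': job_name,
  'predictionInput': {
    'dataFormat': 'TEXT',
    'inputPaths': [prediction_data_path],
    'outputPath': prediction_path,
    'region': storage_region,
    'versionName': 'projects/{}/models/{}/versions/{}'.format(
        project_id, model_name, model_version)
  }
}

# Submit the job; the response echoes the job resource, initially QUEUED.
response = ml_service.projects().jobs().create(
    parent='projects/' + project_id, body=job_body).execute()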

The status of the job can be inspected in the Cloud Console. Once the job completes, the outputs should be visible at the specified output path.
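
The job state can also be polled directly from the notebook; for example, using gcloud's standard describe command and --format flag:

!gcloud ml-engine jobs describe {job_name} --format 'value(state)'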


In [16]:
!gsutil ls {prediction_path}


gs://cloud-ml-users-datalab-workspace/census/predictions/prediction.errors_stats-00000-of-00001
gs://cloud-ml-users-datalab-workspace/census/predictions/prediction.results-00000-of-00002
gs://cloud-ml-users-datalab-workspace/census/predictions/prediction.results-00001-of-00002

In [17]:
!gsutil cat {prediction_path}/prediction*


{"SERIALNO": "490", "predicted": 26.395479202270508}
{"SERIALNO": "1225", "predicted": 68.57681274414062}
{"SERIALNO": "1226", "predicted": 13.854779243469238}

Conclusion

This concludes the end-to-end workflow, from data preparation through training to deployment and prediction, using a combination of the Datalab ML Toolbox with its out-of-the-box models, Cloud Machine Learning Engine, BigQuery, and Dataflow.